8369190: JavaFrameAnchor on AArch64 has unnecessary barriers and wrong store order in MacroAssembler #27645
Signed-off-by: Justin King <[email protected]>
Thanks. Add a comment to the effect that we don't need a fence because the profiler always reads from the same thread, and we're done.
Done. Let me know if you had something else in mind where you wanted the comment.
Good.
Although in the end we don't need a fence, you prompted me to measure the difference between a releasing store (STLR) and a DMB ST; STR and to my surprise STLR is way faster, at least on Apple hardware.
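To make the comparison concrete, here is a minimal C++ sketch of the two publication styles being measured. The names `payload`, `flag`, `publish_with_release_store`, `publish_with_fence` and `consume` are made up for illustration and are not HotSpot code. On AArch64, GCC and Clang typically compile the release store to STLR, and the release fence plus relaxed store to DMB ISH followed by STR. DMB ISH is a full barrier, stronger than a store-only DMB ISHST, so variant B only approximates the "DMB ST; STR" sequence mentioned above.

```cpp
#include <atomic>

// Hypothetical shared state, for illustration only (not HotSpot code).
static int payload = 0;
static std::atomic<int> flag{0};

// Variant A: a single store-release.
// On AArch64, compilers typically emit STLR for this store.
inline void publish_with_release_store(int value) {
  payload = value;
  flag.store(1, std::memory_order_release);
}

// Variant B: a standalone release fence followed by a relaxed store.
// On AArch64 this typically becomes DMB ISH followed by a plain STR.
inline void publish_with_fence(int value) {
  payload = value;
  std::atomic_thread_fence(std::memory_order_release);
  flag.store(1, std::memory_order_relaxed);
}

// Reader side: an acquire load pairs with either variant.
inline int consume() {
  while (flag.load(std::memory_order_acquire) == 0) {
    // spin until the writer has published
  }
  return payload;
}
```

Both variants give the reader the same guarantee; any speed difference comes from how the hardware implements each instruction sequence, so it would need to be measured per microarchitecture.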
Sorry, approval rescinded.
// unless the value is changing
//
// No fencing required, the members are declared volatile so the compiler will not reorder and
// the profiler always reads from the same thread and should observe the state in program order.
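The reasoning in that comment can be sketched roughly as follows. `AnchorSketch` and its members are hypothetical, simplified stand-ins rather than the real `JavaFrameAnchor`. The assumption is that a non-null `sp` tells a same-thread reader (for example a signal handler) that the other fields are valid, so `sp` is cleared first and written last. Because every member is volatile, the compiler keeps the accesses in program order, and a thread always observes its own stores in program order.

```cpp
// Hypothetical, simplified stand-in for a frame anchor (not HotSpot code).
struct AnchorSketch {
  void* volatile sp = nullptr;  // non-null means "fp and pc are valid"
  void* volatile fp = nullptr;
  void* volatile pc = nullptr;

  // Invalidate first: an interrupting same-thread reader never sees
  // a live sp paired with half-cleared fp/pc.
  void clear() {
    sp = nullptr;
    fp = nullptr;
    pc = nullptr;
  }

  // Publish last: sp is written only after fp and pc are in place.
  void set(void* new_sp, void* new_fp, void* new_pc) {
    fp = new_fp;
    pc = new_pc;
    sp = new_sp;
  }

  // What a same-thread reader might check before walking the stack.
  bool walkable() const { return sp != nullptr; }
};
```

This ordering only helps a reader running on the same thread; a reader on another thread would still need hardware ordering.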
I don't think copy() is ever called. Can we remove it?
Sure.
Ah no. copy() is called after the anchor is constructed.
// Compiler barrier which prevents the compiler from reordering loads and stores.
// It does not prevent the hardware from doing so. Typically you should use
// OrderAccess instead.
static inline void compiler_barrier() {
  __asm__ volatile ("" : : : "memory");
}
You can do this in portable C++ since C++11:
// Compiler barrier which prevents the compiler from reordering loads and stores.
static inline void compiler_barrier() {
  std::atomic_signal_fence(std::memory_order_seq_cst);
}
There are some rules about not calling the Standard C++ libraries in the Guidelines, but given that this one only prevents the compiler from moving things around and does not generate any code, I don't think that really applies. More legalistically minded people might disagree, but I prefer portable code.
That does require including <atomic>, which seems not ideal in globalDefinitions. Additionally <atomic> will likely eventually include <stdatomic.h> which defines memory_order as well. So I think we will have to live with what is currently here.
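For reference, the two compiler-barrier forms discussed in this thread can be put side by side. This is only a sketch: `compiler_barrier_asm`, `compiler_barrier_std`, `data_word`, `ready_word` and `write_in_order` are illustrative names, and the inline-asm form assumes a GCC or Clang compatible compiler. Neither form emits a hardware barrier instruction. In practice both stop the compiler from moving memory accesses across the call.

```cpp
#include <atomic>

// GCC/Clang-specific form: an empty asm statement with a "memory" clobber.
static inline void compiler_barrier_asm() {
  __asm__ volatile ("" : : : "memory");
}

// Portable C++11 form; this is the one that requires including <atomic>.
static inline void compiler_barrier_std() {
  std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Hypothetical usage: keep the two stores in source order at compile time.
static int data_word = 0;
static int ready_word = 0;

inline void write_in_order(int value) {
  data_word = value;
  compiler_barrier_std();  // compiler_barrier_asm() could be used here instead
  ready_word = 1;
}
```

The trade-off raised above is visible in the includes: the portable form pulls in `<atomic>`, while the asm form needs no header but is tied to the compiler's inline-asm syntax.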
Remove unnecessary release barriers in JavaFrameAnchor::{copy,clear} and fix MacroAssembler::set_last_Java_frame to set sp last, as expected by the profiler.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27645/head:pull/27645
$ git checkout pull/27645
Update a local copy of the PR:
$ git checkout pull/27645
$ git pull https://git.openjdk.org/jdk.git pull/27645/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27645
View PR using the GUI difftool:
$ git pr show -t 27645
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27645.diff